{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "skip"
    }
   },
   "source": [
    "# Julia is fast\n",
    "(Originally from https://juliabox.com under tutorials/intro-to-julia/short-version/05.Julia_is_fast.ipynb)\n",
    "\n",
    "Benchmarks are often used to compare languages. They can lead to long discussions, first about exactly what is being benchmarked and second about what explains the differences. These seemingly simple questions can get more complicated than you might at first imagine.\n",
    "\n",
    "The purpose of this notebook is for you to see a simple benchmark for yourself. You can read the notebook and see what happened on the author's MacBook Pro with a 4-core Intel Core i7, or run the notebook yourself.\n",
    "\n",
    "(This material began life as a wonderful lecture by Steven Johnson at MIT: https://github.com/stevengj/18S096/blob/master/lectures/lecture1/Boxes-and-registers.ipynb.)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "# Outline of this notebook\n",
    "\n",
    "- Define the sum function\n",
    "- Implementations & benchmarking of sum in...\n",
    "    - C (hand-written)\n",
    "    - Python (built-in)\n",
    "    - Python (NumPy)\n",
    "    - Python (hand-written)\n",
    "    - Julia (built-in)\n",
    "    - Julia (hand-written)\n",
    "    - Julia (hand-written with SIMD)\n",
    "- Summary of benchmarks"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "# `sum`: An easy enough function to understand"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "Consider the **sum** function `sum(a)`, which computes\n",
    "$$\n",
    "\\mathrm{sum}(a) = \\sum_{i=1}^n a_i,\n",
    "$$\n",
    "where $n$ is the length of `a`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "scrolled": false,
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "1×10000000 LinearAlgebra.Adjoint{Float64,Array{Float64,1}}:\n",
       " 0.306833  0.0475944  0.892107  0.500887  0.283666  …  0.550952  0.768903  0.607982  0.777695  0.76583"
      ]
     },
     "execution_count": 1,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "a = rand(10^7) # 1D vector of random numbers, uniform on [0,1)\n",
    "a'"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "4.999318811582287e6"
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "sum(a)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "The expected result is about 0.5 × 10^7 = 5 × 10^6, since each of the 10^7 entries has mean 0.5."
   ]
  },
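  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a rough sanity check on that expectation (a sketch, not part of the original benchmark): each entry is uniform on [0,1), so it has mean 1/2 and variance 1/12, and a sum of `n` independent entries has mean `n/2` and standard deviation `sqrt(n/12)`. The names `n`, `b`, `expected`, and `spread` are illustrative.\n",
    "\n",
    "```julia\n",
    "n = 10^7\n",
    "expected = n / 2         # mean of the sum: n * 0.5\n",
    "spread = sqrt(n / 12)    # standard deviation of the sum, roughly 913\n",
    "\n",
    "b = rand(n)              # a fresh sample, so `a` is left untouched\n",
    "# deviation from the expectation, in units of the standard deviation\n",
    "abs(sum(b) - expected) / spread\n",
    "```\n",
    "\n",
    "The value shown above, about 4.9993e6, lies roughly 681 below 5e6, which is within one standard deviation."
   ]
  },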
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "# Benchmarking a few ways in a few languages"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [],
   "source": [
    "using Pkg\n",
    "for p in (\"BenchmarkTools\",\"Plots\",\"PyCall\",\"Conda\")\n",
    "    Base.find_package(p) === nothing && Pkg.add(p)  # install only if not already available\n",
    "end"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "slideshow": {
     "slide_type": "skip"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "  0.009428 seconds (5 allocations: 176 bytes)\n",
      "  0.004772 seconds (5 allocations: 176 bytes)\n",
      "  0.003664 seconds (5 allocations: 176 bytes)\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "4.999318811582287e6"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "@time sum(a)\n",
    "\n",
    "@time sum(a)\n",
    "\n",
    "@time sum(a)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "skip"
    }
   },
   "source": [
    "The `@time` macro can yield noisy results, so it's not our best choice for benchmarking!\n",
    "\n",
    "Luckily, Julia has a `BenchmarkTools.jl` package to make benchmarking easy and accurate:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "scrolled": false,
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "using BenchmarkTools"
   ]
  },
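  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A minimal sketch of how the package is typically used, assuming BenchmarkTools is installed. The `$` interpolates the global variable into the benchmarked expression, so the cost of looking up an untyped global is not part of the measurement. The names `x` and `t` are illustrative.\n",
    "\n",
    "```julia\n",
    "using BenchmarkTools\n",
    "\n",
    "x = rand(10^6)\n",
    "\n",
    "@btime sum($x)          # prints the minimum time and the allocations\n",
    "t = @belapsed sum($x)   # minimum elapsed time, in seconds\n",
    "```\n",
    "\n",
    "`@benchmark`, used in the rest of this notebook, collects the full set of samples instead of reporting only the minimum."
   ]
  },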
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "#  1. The C language\n",
    "\n",
    "C is often considered the gold standard: hard on the human, nice for the machine. Getting within a factor of 2 of C is often satisfying. Nonetheless, even within C there are many kinds of optimizations that a naive C programmer may or may not take advantage of.\n",
    "\n",
    "The current author does not speak C, so he does not read the cell below, but he is happy to know that you can put C code in a Julia session, compile it, and run it. Note that the `\"\"\"` delimiters wrap a multi-line string."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "scrolled": false,
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "c_sum (generic function with 1 method)"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "using Libdl\n",
    "\n",
    "C_code = \"\"\"\n",
    "#include <stddef.h>\n",
    "double c_sum(size_t n, double *X) {\n",
    "    double s = 0.0;\n",
    "    for (size_t i = 0; i < n; ++i) {\n",
    "        s += X[i];\n",
    "    }\n",
    "    return s;\n",
    "}\n",
    "\"\"\"\n",
    "\n",
    "const Clib = tempname()   # generate a path for a temporary file\n",
    "# compile to a shared library by piping C_code to gcc\n",
    "# (works only if you have gcc installed):\n",
    "open(`gcc -std=c99 -fPIC -O3 -msse3 -xc -shared -o $(Clib * \".\" * Libdl.dlext) -`, \"w\") do f\n",
    "    print(f, C_code)\n",
    "end\n",
    "# define a Julia function that calls the C function:\n",
    "c_sum(X::Array{Float64}) = ccall((\"c_sum\", Clib), Float64, (Csize_t, Ptr{Float64}), length(X), X)"
   ]
  },
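  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The `ccall` in the cell above takes a (function name, library) pair, the return type, a tuple of argument types, and then the arguments themselves. As a smaller illustration of the same pattern, and assuming the C standard library is already loaded into the Julia process (as it normally is), one can call `strlen` without compiling anything. The wrapper name `c_strlen` is hypothetical.\n",
    "\n",
    "```julia\n",
    "# ccall(function, return type, (argument types...), arguments...)\n",
    "c_strlen(s::String) = ccall(:strlen, Csize_t, (Cstring,), s)\n",
    "\n",
    "c_strlen(\"Julia\")   # number of bytes before the terminating NUL\n",
    "```\n",
    "\n",
    "Julia converts the `String` to a `Cstring` for the call, much as `length(X)` and `X` are converted to `Csize_t` and `Ptr{Float64}` in `c_sum`."
   ]
  },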
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {
    "scrolled": false,
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "4.9993188115826165e6"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "c_sum(a)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {
    "scrolled": false,
    "slideshow": {
     "slide_type": "skip"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "true"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "c_sum(a) ≈ sum(a) # type \\approx and then <TAB> to get the ≈ symbol"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {
    "slideshow": {
     "slide_type": "skip"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "3.296881914138794e-7"
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "c_sum(a) - sum(a)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {
    "scrolled": true,
    "slideshow": {
     "slide_type": "skip"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "isapprox (generic function with 8 methods)"
      ]
     },
     "execution_count": 10,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "≈  # alias for the `isapprox` function"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {
    "scrolled": true,
    "slideshow": {
     "slide_type": "skip"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "search: \u001b[0m\u001b[1mi\u001b[22m\u001b[0m\u001b[1ms\u001b[22m\u001b[0m\u001b[1ma\u001b[22m\u001b[0m\u001b[1mp\u001b[22m\u001b[0m\u001b[1mp\u001b[22m\u001b[0m\u001b[1mr\u001b[22m\u001b[0m\u001b[1mo\u001b[22m\u001b[0m\u001b[1mx\u001b[22m\n",
      "\n"
     ]
    },
    {
     "data": {
      "text/latex": [
       "\\begin{verbatim}\n",
       "isapprox(x, y; rtol::Real=atol>0 ? 0 : √eps, atol::Real=0, nans::Bool=false, norm::Function)\n",
       "\\end{verbatim}\n",
       "Inexact equality comparison: \\texttt{true} if \\texttt{norm(x-y) <= max(atol, rtol*max(norm(x), norm(y)))}. The default \\texttt{atol} is zero and the default \\texttt{rtol} depends on the types of \\texttt{x} and \\texttt{y}. The keyword argument \\texttt{nans} determines whether or not NaN values are considered equal (defaults to false).\n",
       "\n",
       "For real or complex floating-point values, if an \\texttt{atol > 0} is not specified, \\texttt{rtol} defaults to the square root of \\href{@ref}{\\texttt{eps}} of the type of \\texttt{x} or \\texttt{y}, whichever is bigger (least precise). This corresponds to requiring equality of about half of the significand digits. Otherwise, e.g. for integer arguments or if an \\texttt{atol > 0} is supplied, \\texttt{rtol} defaults to zero.\n",
       "\n",
       "\\texttt{x} and \\texttt{y} may also be arrays of numbers, in which case \\texttt{norm} defaults to \\texttt{vecnorm} but may be changed by passing a \\texttt{norm::Function} keyword argument. (For numbers, \\texttt{norm} is the same thing as \\texttt{abs}.) When \\texttt{x} and \\texttt{y} are arrays, if \\texttt{norm(x-y)} is not finite (i.e. \\texttt{±Inf} or \\texttt{NaN}), the comparison falls back to checking whether all elements of \\texttt{x} and \\texttt{y} are approximately equal component-wise.\n",
       "\n",
       "The binary operator \\texttt{≈} is equivalent to \\texttt{isapprox} with the default arguments, and \\texttt{x ≉ y} is equivalent to \\texttt{!isapprox(x,y)}.\n",
       "\n",
       "Note that \\texttt{x ≈ 0} (i.e., comparing to zero with the default tolerances) is equivalent to \\texttt{x == 0} since the default \\texttt{atol} is \\texttt{0}.  In such cases, you should either supply an appropriate \\texttt{atol} (or use \\texttt{norm(x) ≤ atol}) or rearrange your code (e.g. use \\texttt{x ≈ y} rather than \\texttt{x - y ≈ 0}).   It is not possible to pick a nonzero \\texttt{atol} automatically because it depends on the overall scaling (the \"units\") of your problem: for example, in \\texttt{x - y ≈ 0}, \\texttt{atol=1e-9} is an absurdly small tolerance if \\texttt{x} is the \\href{https://en.wikipedia.org/wiki/Earth_radius}{radius of the Earth} in meters, but an absurdly large tolerance if \\texttt{x} is the \\href{https://en.wikipedia.org/wiki/Bohr_radius}{radius of a Hydrogen atom} in meters.\n",
       "\n",
       "\\section{Examples}\n",
       "\\begin{verbatim}\n",
       "julia> 0.1 ≈ (0.1 - 1e-10)\n",
       "true\n",
       "\n",
       "julia> isapprox(10, 11; atol = 2)\n",
       "true\n",
       "\n",
       "julia> isapprox([10.0^9, 1.0], [10.0^9, 2.0])\n",
       "true\n",
       "\n",
       "julia> 1e-10 ≈ 0\n",
       "false\n",
       "\n",
       "julia> isapprox(1e-10, 0, atol=1e-8)\n",
       "true\n",
       "\\end{verbatim}\n"
      ],
      "text/markdown": [
       "```\n",
       "isapprox(x, y; rtol::Real=atol>0 ? 0 : √eps, atol::Real=0, nans::Bool=false, norm::Function)\n",
       "```\n",
       "\n",
       "Inexact equality comparison: `true` if `norm(x-y) <= max(atol, rtol*max(norm(x), norm(y)))`. The default `atol` is zero and the default `rtol` depends on the types of `x` and `y`. The keyword argument `nans` determines whether or not NaN values are considered equal (defaults to false).\n",
       "\n",
       "For real or complex floating-point values, if an `atol > 0` is not specified, `rtol` defaults to the square root of [`eps`](@ref) of the type of `x` or `y`, whichever is bigger (least precise). This corresponds to requiring equality of about half of the significand digits. Otherwise, e.g. for integer arguments or if an `atol > 0` is supplied, `rtol` defaults to zero.\n",
       "\n",
       "`x` and `y` may also be arrays of numbers, in which case `norm` defaults to `vecnorm` but may be changed by passing a `norm::Function` keyword argument. (For numbers, `norm` is the same thing as `abs`.) When `x` and `y` are arrays, if `norm(x-y)` is not finite (i.e. `±Inf` or `NaN`), the comparison falls back to checking whether all elements of `x` and `y` are approximately equal component-wise.\n",
       "\n",
       "The binary operator `≈` is equivalent to `isapprox` with the default arguments, and `x ≉ y` is equivalent to `!isapprox(x,y)`.\n",
       "\n",
       "Note that `x ≈ 0` (i.e., comparing to zero with the default tolerances) is equivalent to `x == 0` since the default `atol` is `0`.  In such cases, you should either supply an appropriate `atol` (or use `norm(x) ≤ atol`) or rearrange your code (e.g. use `x ≈ y` rather than `x - y ≈ 0`).   It is not possible to pick a nonzero `atol` automatically because it depends on the overall scaling (the \"units\") of your problem: for example, in `x - y ≈ 0`, `atol=1e-9` is an absurdly small tolerance if `x` is the [radius of the Earth](https://en.wikipedia.org/wiki/Earth_radius) in meters, but an absurdly large tolerance if `x` is the [radius of a Hydrogen atom](https://en.wikipedia.org/wiki/Bohr_radius) in meters.\n",
       "\n",
       "# Examples\n",
       "\n",
       "```jldoctest\n",
       "julia> 0.1 ≈ (0.1 - 1e-10)\n",
       "true\n",
       "\n",
       "julia> isapprox(10, 11; atol = 2)\n",
       "true\n",
       "\n",
       "julia> isapprox([10.0^9, 1.0], [10.0^9, 2.0])\n",
       "true\n",
       "\n",
       "julia> 1e-10 ≈ 0\n",
       "false\n",
       "\n",
       "julia> isapprox(1e-10, 0, atol=1e-8)\n",
       "true\n",
       "```\n"
      ],
      "text/plain": [
       "\u001b[36m  isapprox(x, y; rtol::Real=atol>0 ? 0 : √eps, atol::Real=0, nans::Bool=false, norm::Function)\u001b[39m\n",
       "\n",
       "  Inexact equality comparison: \u001b[36mtrue\u001b[39m if \u001b[36mnorm(x-y) <= max(atol, rtol*max(norm(x), norm(y)))\u001b[39m. The default \u001b[36matol\u001b[39m is\n",
       "  zero and the default \u001b[36mrtol\u001b[39m depends on the types of \u001b[36mx\u001b[39m and \u001b[36my\u001b[39m. The keyword argument \u001b[36mnans\u001b[39m determines whether or\n",
       "  not NaN values are considered equal (defaults to false).\n",
       "\n",
       "  For real or complex floating-point values, if an \u001b[36matol > 0\u001b[39m is not specified, \u001b[36mrtol\u001b[39m defaults to the square root\n",
       "  of \u001b[36meps\u001b[39m of the type of \u001b[36mx\u001b[39m or \u001b[36my\u001b[39m, whichever is bigger (least precise). This corresponds to requiring equality of\n",
       "  about half of the significand digits. Otherwise, e.g. for integer arguments or if an \u001b[36matol > 0\u001b[39m is supplied,\n",
       "  \u001b[36mrtol\u001b[39m defaults to zero.\n",
       "\n",
       "  \u001b[36mx\u001b[39m and \u001b[36my\u001b[39m may also be arrays of numbers, in which case \u001b[36mnorm\u001b[39m defaults to \u001b[36mvecnorm\u001b[39m but may be changed by passing a\n",
       "  \u001b[36mnorm::Function\u001b[39m keyword argument. (For numbers, \u001b[36mnorm\u001b[39m is the same thing as \u001b[36mabs\u001b[39m.) When \u001b[36mx\u001b[39m and \u001b[36my\u001b[39m are arrays, if\n",
       "  \u001b[36mnorm(x-y)\u001b[39m is not finite (i.e. \u001b[36m±Inf\u001b[39m or \u001b[36mNaN\u001b[39m), the comparison falls back to checking whether all elements of \u001b[36mx\u001b[39m\n",
       "  and \u001b[36my\u001b[39m are approximately equal component-wise.\n",
       "\n",
       "  The binary operator \u001b[36m≈\u001b[39m is equivalent to \u001b[36misapprox\u001b[39m with the default arguments, and \u001b[36mx ≉ y\u001b[39m is equivalent to\n",
       "  \u001b[36m!isapprox(x,y)\u001b[39m.\n",
       "\n",
       "  Note that \u001b[36mx ≈ 0\u001b[39m (i.e., comparing to zero with the default tolerances) is equivalent to \u001b[36mx == 0\u001b[39m since the\n",
       "  default \u001b[36matol\u001b[39m is \u001b[36m0\u001b[39m. In such cases, you should either supply an appropriate \u001b[36matol\u001b[39m (or use \u001b[36mnorm(x) ≤ atol\u001b[39m) or\n",
       "  rearrange your code (e.g. use \u001b[36mx ≈ y\u001b[39m rather than \u001b[36mx - y ≈ 0\u001b[39m). It is not possible to pick a nonzero \u001b[36matol\u001b[39m\n",
       "  automatically because it depends on the overall scaling (the \"units\") of your problem: for example, in \u001b[36mx - y\n",
       "  ≈ 0\u001b[39m, \u001b[36matol=1e-9\u001b[39m is an absurdly small tolerance if \u001b[36mx\u001b[39m is the radius of the Earth\n",
       "  (https://en.wikipedia.org/wiki/Earth_radius) in meters, but an absurdly large tolerance if \u001b[36mx\u001b[39m is the radius of\n",
       "  a Hydrogen atom (https://en.wikipedia.org/wiki/Bohr_radius) in meters.\n",
       "\n",
       "\u001b[1m  Examples\u001b[22m\n",
       "\u001b[1m  ≡≡≡≡≡≡≡≡≡≡\u001b[22m\n",
       "\n",
       "\u001b[36m  julia> 0.1 ≈ (0.1 - 1e-10)\u001b[39m\n",
       "\u001b[36m  true\u001b[39m\n",
       "\u001b[36m  \u001b[39m\n",
       "\u001b[36m  julia> isapprox(10, 11; atol = 2)\u001b[39m\n",
       "\u001b[36m  true\u001b[39m\n",
       "\u001b[36m  \u001b[39m\n",
       "\u001b[36m  julia> isapprox([10.0^9, 1.0], [10.0^9, 2.0])\u001b[39m\n",
       "\u001b[36m  true\u001b[39m\n",
       "\u001b[36m  \u001b[39m\n",
       "\u001b[36m  julia> 1e-10 ≈ 0\u001b[39m\n",
       "\u001b[36m  false\u001b[39m\n",
       "\u001b[36m  \u001b[39m\n",
       "\u001b[36m  julia> isapprox(1e-10, 0, atol=1e-8)\u001b[39m\n",
       "\u001b[36m  true\u001b[39m"
      ]
     },
     "execution_count": 11,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "?isapprox"
   ]
  },
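  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "For two scalars and the default tolerances, the documented test reduces to comparing the absolute difference with a relative tolerance of `sqrt(eps)`. A sketch that spells this out, using the two results printed earlier as illustrative values (the names `x`, `y`, and `rtol` are not from the original):\n",
    "\n",
    "```julia\n",
    "x = 4.9993188115826165e6   # value printed for c_sum(a) above\n",
    "y = 4.999318811582287e6    # value printed for sum(a) above\n",
    "\n",
    "rtol = sqrt(eps(Float64))  # default relative tolerance, about 1.5e-8\n",
    "\n",
    "# the check that x ≈ y performs for scalars when atol is 0\n",
    "abs(x - y) <= rtol * max(abs(x), abs(y))\n",
    "```\n",
    "\n",
    "The difference is about 3.3e-7, far below the allowed `rtol * max(abs(x), abs(y))` of roughly 0.07, so the comparison holds."
   ]
  },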
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "skip"
    }
   },
   "source": [
    "We can now benchmark the C code directly from Julia:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {
    "scrolled": true,
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "BenchmarkTools.Trial: \n",
       "  memory estimate:  0 bytes\n",
       "  allocs estimate:  0\n",
       "  --------------\n",
       "  minimum time:     10.158 ms (0.00% GC)\n",
       "  median time:      10.180 ms (0.00% GC)\n",
       "  mean time:        10.188 ms (0.00% GC)\n",
       "  maximum time:     10.538 ms (0.00% GC)\n",
       "  --------------\n",
       "  samples:          491\n",
       "  evals/sample:     1"
      ]
     },
     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "c_bench = @benchmark c_sum($a)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {
    "scrolled": false,
    "slideshow": {
     "slide_type": "skip"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "C: Fastest time was 10.158363 msec\n"
     ]
    }
   ],
   "source": [
    "println(\"C: Fastest time was $(minimum(c_bench.times) / 1e6) msec\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {
    "scrolled": true,
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Dict{Any,Any} with 1 entry:\n",
       "  \"C\" => 10.1584"
      ]
     },
     "execution_count": 14,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "d = Dict()  # a \"dictionary\", i.e. an associative array\n",
    "d[\"C\"] = minimum(c_bench.times) / 1e6  # in milliseconds\n",
    "d"
   ]
  },
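  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Calling `Dict()` with no arguments creates a `Dict{Any,Any}`, as the output above shows. That is fine for a handful of timings. As a sketch of an alternative, the key and value types can be declared up front; the name `timings` is illustrative and the number is copied from the output above.\n",
    "\n",
    "```julia\n",
    "timings = Dict{String,Float64}()   # language label => time in milliseconds\n",
    "timings[\"C\"] = 10.1584            # value copied from the output above\n",
    "\n",
    "timings[\"C\"]                      # look up an entry by key\n",
    "haskey(timings, \"Python\")         # check for a key before reading it\n",
    "```\n",
    "\n",
    "With declared types, storing a value that cannot be converted to `Float64` raises an error instead of being accepted silently."
   ]
  },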
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {
    "scrolled": true,
    "slideshow": {
     "slide_type": "skip"
    }
   },
   "outputs": [],
   "source": [
    "using Plots, Statistics\n",
    "default(fmt = :png)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {
    "scrolled": false,
    "slideshow": {
     "slide_type": "skip"
    }
   },
   "outputs": [
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAlgAAAGQCAIAAAD9V4nPAAAABmJLR0QA/wD/AP+gvaeTAAAgAElEQVR4nO3de3xT9eH/8c9J7zd6X6FIi8gdvlpB3ORmFUWRlYK/74QpwzFnRSwg+yoXEVBRV3AyHbopahXdwH1xFLnMy5AhKswvDBRlpVBxUGpbSlvaJr2kyfn8/sjsutqcQpOctP28ng/+aE8++Zx3PknzJjnNqSalFAAAqMri7wAAAPgTRQgAUBpFCABQGkUIAFAaRQgAUBpFCABQGkUIAFAaRQgAUBpFCABQGkUIAFBapy7CwsJCq9XagSs2NTV5PQwEC+szLKyPsLA+0s0WtlMX4c9//vODBw924IqVlZW6rns9DyoqKjg5rS+cO3fO3xG6ISllRUWFv1N0Q7quV1ZW+juFN3XqIgQAwNcoQgCA0ihCAIDSKEIAgNIoQgCA0ihCAIDSKEIAgNIoQgCA0gL9HQDwm5lZ8z7cs9t4zJIFc++77z5z8qATcjgc03927//9bZ/BGE2Ixx9ePGvWLNNSwbsoQqhr48vPy5WHhKa5HbHnxbNnz5qYCJ1OfX39ts0bHUuNijDo/V9xbqAujSKE2pKHCM39AYKoBBOjoJPSNItIHmo0IDLetDDwBY4RAgCURhECAJRGEQIAlEYRAgCURhECAJRGEQIAlEYRAgCURhECAJRGEQIAlMaZZQD37HUbt+z+vy+OGQwJsYitb/3RtEQAvI4iBNz7+mBhRN/CxEyjMevvEIIiBLowihAw1HuYGPUjowHr7zArCgCf4BghAEBpFCEAQGkUIQBAaRQhAEBpFCEAQGkUIQBAaRQhAEBp5hXh22+/PXz48JiYmPHjxx8/fty1ccyYMdq35syZY1oYAABcTPpA/enTp2fOnPn++++npaU9//zzs2fP/uSTT6SUx44dO3PmTHR0tBAiMJBP9wMAzGZS95w8eXLGjBnXXHONEOLOO+/MyckRQpSVldnt9szMzIKCggkTJqxfvz40NLTlterr61955ZVdu3a1mm3IkCGZmUZnvbLZbGFhYRYLb/x6mc1ms1qtmqb5O4h3SC/NY7VaPZzBtbBeCYNmUkrPF9Zms7W/I6fjnXfeqaysNBgTGhp6//33e5Kk89B1vRM+YkNDQzv8asqkIkxPT09PTxdCOJ3OFStWTJ8+XQhRWlo6atSotWvXpqSkLFy4cMGCBZs2bWp1RafT6XA4Wm10OBxSGj2JyW958zbg24X1d4pOx/M1YWF9wSvPAxdydedXn+4Ki9n1ufsRjsaQj9YtWLDAkySdR/d7gjX13chdu3YtWrRo4sSJjz/+uBAiLS1t9+7drotycnKGDRvWanxYWFhWVparQS9KXV1dVFQUrwi9zmazRUVFdZtXhJqXXhRGRUV5OIPVavV8ErQipXQ9Ffh6R5qmieETxY3uX/DVV2sfr+82d7Gu6w0NDd3m5gjTilBK+dBDD33yySdvvvnmwIEDXRsPHTrU0NAwevRoIURwcHBISIg5YQAAaGbSa6Z9+/bl5eVt27YtOTnZarW63ly22WzTpk3Lz8+32+2rVq2aOnWqOWEAAGhm0ivCPXv2FBQUxMbGNm+RUo4dO3blypUZGRnV1dW33HLLunXrzAkDAEAzk14RLlu2TP4nIYSmaXPnzi0sLCwvL9+wYUOPHj3MCQMAQDN+nQQAoDSKEACgNIoQAKA0ihAAoDSKEACgNIoQAKA0ihAAoDSKEACgNIoQAKA0ihAAoDSKEACgNIoQAKA0ihAAoDSKEACgNIoQAKA0ihAAoDSKEACgtEB/BwC6vIKCAuMBCQkJ8fHx5oQBcLEoQsAzYT1G3TDF4HJ7TcXSX8xbuXKlaYkAXBSKEPBMfU3tynNGA7Y9ZlYUAB3BMUIAgNIoQgCA0ihCAIDSKEIAgNIoQgCA0ihCAIDSKEIAgNIoQgCA0ihCAIDSKEIAgNIoQsDHKk4/+tiq
gMAgg38pqX39nRJQF+caBXys4p9y8lI5eanbAVLqc6NMDATgP1CEgO9pmggIcnup1E2MAqA13hoFACiNIgQAKI0iBAAojSIEACiNIgQAKI0iBAAojSIEACiNIgQAKI0iBAAojSIEACiNIgQAKI0iBAAojSIEACiNIgQAKI0iBAAojSIEACiNIgQAKI0iBAAozbwifPvtt4cPHx4TEzN+/Pjjx4+7NlZVVWVkZMTFxU2ZMqWqqsq0MAAAuJhUhKdPn545c+ZLL71UUlIyZcqU2bNnu7avXr06NTW1pKQkJSVlzZo15oQBAKCZSUV48uTJGTNmXHPNNWFhYXfeeWdBQYFre15eXnZ2dkhISHZ29pYtW8wJAwBAs0BzdpOenp6eni6EcDqdK1asmD59umt7cXFxamqqEML1urDVtWw226OPPvrb3/621farrrrq7rvvNthddXV1YGCgxcIRUC+rrq4ODg7WNM3fQbxD+jtASxwa8DopZXV1dUhIiCeTWK3Wdh8pUrb3UJJ6kxZ0y623GQxpqG+w2qwJCQkGY2IjI557dm07+/I9XdddTwX+DvIfIiMjg4KCOnZdk4rQZdeuXYsWLZo4ceLjjz/u2iKldD2rSimdTmer8UFBQVdfffXgwYNbbU9JSQkLCzPYUWhoaFhYGEXoda6F7TZF2KkYP6TRAVJK1yPWk0kcDocQ7Tzg2/+JaLA662reiZtsMMSy43H90qtF3ES3I+x1YVsfemX979rZl+/puu75wnqdJ0/4JhWhlPKhhx765JNP3nzzzYEDBzZvT05OLioqGjBgQHFxce/evVtdKzg4eNKkSa6XkhclJCQkNDSUIvQ618J2myLUOtOLwtDQUH9H6G6klK5HrCeTNDU1eSeNJUD84HaDy7Xd60TqCKMxdVVi60Od4XGi67rnC9upmFQV+/bty8vL27ZtW3JystVqtVqtru0ZGRm5ublSytzc3MzMTHPCAADQzKRXhHv27CkoKIiNjW3e4npXfcWKFXfccUefPn1GjBjxxhtvmBMGAIBmJhXhsmXLli1b9t3tMTExO3fuNCcDAADfxVE0AIDSKEIAgNIoQgCA0ihCAIDSKEIAgNIoQgCA0ihCAIDSKEIAgNIoQgCA0ihCAIDSKEIAgNIoQgCA0ihCAIDSKEIAgNIoQgCA0ihCAIDSKEIAgNIoQgCA0gL9HQDwlR07dkgpDQYYXWa67du3Gw/45ptvkpOTDQZYLJbJkyd7NRSgBIoQ3dPLL788b+WakOSBRoMMa9I8UoqgsJ889qLBkLrCAzIkKqLPYINJaj97z+lweD8e0N1RhOieSkpKGkf8qCHzEaNBB4NNStMOKex11ffkGQ1Zc70YOqH6h8vcDtAdlnujvJ4MUAHHCAEASqMIAQBKowgBAEqjCAEASqMIAQBKowgBAEqjCAEASqMIAQBKowgBAEqjCAEASqMIAQBKowgBAEqjCAEASqMIAQBKowgBAEqjCAEASqMIAQBKowgBAEqjCAEASqMIAQBKowgBAEqjCAEASqMIAQBKowgBAEqjCAEASqMIAQBKowgBAEqjCAEASqMIAQBKowgBAEqjCAEASjO1CJ1O5+DBg1tuGTNmjPatOXPmmBkGAAAhRKBpe3r22Wc3btxYUFDQvEVKeezYsTNnzkRHRwshAgPNCwMAgEsbrwg1TSstLW255dChQ3FxcR7u6fLLL1++fHnLLWVlZXa7PTMzs1evXjNnzqypqfFwFwAAXKx/vwizWq1nzpxxfX3ixInz5883X7R3796mpiYP93Tddde12lJaWjpq1Ki1a9empKQsXLhwwYIFmzZtajmgtrY2KysrMjKy1RWvvfbaZcuWGeyrsrJS0zSLhSOgXlZZWRkQEKBpmr+DtK+urk61Q+BSiHPnzvk7RScipaysrPTwrSar1SqEbHdHnuziwkkhje/ivx/+/L5fLDKeJCEmakfeW57E0HXd9VTgySRe16NHj+Dg4I5d998PkT17
9mRkZLi+Hj9+fMtBAQEBCxcu7HA+d9LS0nbv3u36OicnZ9iwYa0GhIeHz549e8SIEa22x8fHu95NdaehoSE6Opoi9Lr6+vro6OguUYQhISFCePq/t65FE8L450I1UkrXU4Enk1gsFiHaecCb9hOhCc345mzekvdV8rXiqv92O6KhpuSVOzxcE13X7XZ7Z3uweVLM/y7CH/7wh67/12iaVlJS0rNnTy9EM3To0KGGhobRo0cLIYKDg0NCQloNCAgIGD58uGvARQkKCgoKCqIIvc61sF2iCAMCAlQrQiFEUFCQvyN0IlJK1yPWk0k625Ia57FYLCKuj7h0lNsR1op2J2mXruueL2yn0kZVSClNaEEhhM1mmzZtWn5+vt1uX7Vq1dSpU03YKQAALbX97rnNZisqKmq1sdUnHzw3duzYlStXZmRkVFdX33LLLevWrfPu/AAAtKuNInzzzTfvvPNOu93eartXDgi3nETTtLlz586dO9fzaQEA6Jg23hpdunTpj3/849raWvmfzA8HAICvtVGE1dXVc+bM+e6HFgAA6H7aKMIbbrjhyJEj5kcBAMB8bRwjfOCBB+bPn19fXz927NiIiIjm7V7/ZRkAAPyujSL8/ve/L4T49NNPW23nMCEAoPtpowgpPACAOjj3CgBAaRQhAEBpbbw16u5MkrxlCgDoftoowvz8/Oav6+rq9u/f//LLL+fl5ZmYCgAAk7RRhK0+JjFixIjw8PCsrKz333/frFQAAJjkgo4RJicn/+1vf/N1FAAAzNfGK8Jjx461/NZmsz311FN9+vQxKxIAAOZpowiHDBnSaktqampubq4peQAAMBUfqAcAKI3PEQIAlNZ2EW7fvn3cuHEJCQlxcXHjxo3buXOnybEAADBHG0W4efPmW2+9ddy4cVu3bnU1YmZm5pYtW8wPBwCAr7VxjPDJJ59ctGjRE0884fp2zJgxuq4/8cQTt956q7nZAMC3mpqafvOb3xgMaGxs1HXdtDzwizaK8Pjx408++WTLLddee+1zzz1nViQAMMmyFY+szfvIculItyMa65xOp4mJ4AdtFGFqaurRo0cnTZrUvOXLL79MTU01MRUAmKGxsdH5X5OdN/3C7YjqEvHRBhMTwQ/aKMKsrKzly5cnJSVNnjxZCLFz587HHnts1apVpmcDAMDn2ijC+fPnOxyOhQsXzpo1SwgRHx+/YsWK+fPnm54NAACfa6MILRbLAw888D//8z/l5eVCiMTERHd/mAkAgK6u7c8RHjly5A9/+MP3vve9733vezk5OV9++aXJsQAAMEcbRfjRRx+NGDFi48aNrm/feuutkSNH/uUvfzE3GAAAZmijCJcuXZqZmbljxw7XtwcOHJgxY8by5cvNDQYAgBnaKMIjR47cfvvtFsu/LrJYLD/+8Y95dxQA0C21UYTJycklJSUttxQWFl5yySVmRQIAwDxt/Nbovffeu3z58oSEhIkTJ1oslg8++GDlypVLliwxPxwAAL7W9ucINU1buHBhaWmpECIuLs71aQrTs0Fdf/nLX4xP8FhSUtKzZ0+DD/YUFhYKodzbGO+9957xgAEDBvTr18+cML7mdDp37dplMEBKefz48UGDBhmMOXXqlLAkejsaupg2ilDTtPnz58+bN6+iosLhcCQlJfE5Qphpw4YNc5Y+FtLzMoMx1f/YHz3kB8L9I9NW+Hc5/h4fpOusdKceFDp96dMGQxpLCh/OvmvZsmWmhfKpZ9Y9//BTz4Uk9XU3QDqaak78PXrIDwwmsRUeEjdd5f1w6FLaKEIXTdMSEhLMjAK4nDlzxj7itoapjxkNygqpvnebsLh9AIunb/J6sM5NCnt99VyjPx1q2brCtDQmKCoqahj9s4ab3L9ZVVMmFl/Wzpr8cqz3k6Gr4S/UAwCURhECAJRGEQIAlEYRAgCURhECAJRGEQIAlEYRAgCURhECAJRGEQIAlOb+xBwAup2ampozZ84YDNA0rXfv3qblAToDihBQ
hSw9/vQH7z336h8MxtirzjY12U2LBHQGFCGgjJoy582L6iYvdTvAYQ+YF2tiIKBT4BghAEBpFCEAQGkUIQBAaRQhAEBpFCEAQGkUIQBAaRQhAEBpphah0+kcPHhwyy1VVVUZGRlxcXFTpkypqqoyMwwAAMLMInz22WdHjx5dUFDQcuPq1atTU1NLSkpSUlLWrFljWhgAAFzMK8LLL798+fLlrTbm5eVlZ2eHhIRkZ2dv2bLFtDAAALiYd4q166677rsbi4uLU1NThRCu14WtLj1//vzkyZMDA1uHnDRp0rPPPmuwr/Lycl3XLRaOgHpZeXm5EELTNJ/uxWq1+nR+GJBClJWV+TvFBamrqxMi3t8phBBCSmnajozvnfr6es8naZeu6xUVFZ7M4AsxMTEhISEdu66fzzUqpXQ9q0opnU5nq0ujo6M3bdo0duzYVtuDg4PDw8MNptV1PTExkSL0OqfTmZiY6OsijIiIEMLm013AHU2IxMREf6e4IGFhYf6O8C++/olouSPjeyc0NFS0V4XtTtIuXddF53ucePKE7+ciTE5OLioqGjBgQHFx8Xf/+IumaT169IiLi7vYaS3f8lJM/ItrVX39Y2/a0wra1FV+cNR8nBjfOxe4Jp7fxd3sCdbPtyQjIyM3N1dKmZubm5mZ6d8wAAAF+bkIV6xYceTIkT59+hw9evThhx/2bxgAgILMfmu01VHlmJiYnTt3mpwBAIBm3edNXgAAOoAiBAAojSIEACiNIgQAKI0iBAAojSIEACjNz2eWAQBcEN3RFBAybNQYgyElxcXi2gWmJeo2KEIA6AoarA5b9T8mPGkwJOCNOabF6U4oQgDoIjRN9B9tNCDY6K8RwB2OEQIAlEYRAgCURhECAJRGEQIAlEYRAgCURhECAJRGEQIAlEYRAgCURhECAJTGmWUA/JsMDPnJXfcYj7njR9Nuvvlmc/IAJqAIAXzLYdcb635vv8JgiOWzbUP7H6YI0Z1QhABa0DRx7d1Gl1eeMi0LYA6OEQIAlEYRAgCURhECAJRGEQIAlEYRAgCURhECAJRGEQIAlEYRAgCURhECAJRGEQIAlEYRAgCURhECAJRGEQIAlEYRAgCURhECAJRGEQIAlEYRAgCURhECAJRGEQIAlEYRAgCURhECAJQW6O8AAADzSN25e/du4zGDBg3q3bu3OXk6A4oQAJTRUNvgkP9v4SqjId+ceGrlkuzsbNNC+R1FCADKcDqk7jw/732DIaF/XGBanE6CY4QAAKVRhAAApVGEAAClUYQAAKVRhAAApVGEAAClUYQAAKX5uQjHjBmjfWvOnDn+DQMAUJA/P1AvpTx27NiZM2eio6OFEIGBfLofAGA2f3ZPWVmZ3W7PzMwsKCiYMGHC+vXrQ0NDWw6QUlZUVJSUlLS6YmhoqKs73dF1Xdd17ydWnmthNU3z6V6klD6dHx6SUprw82W1Wm02m8EAm80mRLyvY6jJ+C7Wv2VmpHZZLB1/g9OfRVhaWjpq1Ki1a9empKQsXLhwwYIFmzZtajmgpqbmrrvuCgoKanXFm2+++Ve/+pXBzBUVFcKzdUGbKioqLBaLr4vQ+OkP/iWltNls5eXlPt1LU1PTmBsnny1t/Z/glux1VjFlhU9jXKBu9l83KaXVajW4i3Vddz0VmJmqXTExMSEhIR27rj+LMC0trfkk6Dk5OcOGDWs1IDo6euvWrenp6R2YPDExsbPdT92AlDIpKcnXRRgZGSmE1ae7QIdpmhYZGZmUlOTTvdTV1X3z9QnHc9UGYyw543ya4cL5+ifCZJqmRUVFGdzFuq5bLBZfPwbM5M+qOHTo0L59+1xfBwcHd7jMAQDoMH8Woc1mmzZtWn5+vt1uX7Vq1dSpU/0YBgCgJn8W4dixY1euXJmRkdG7d++qqqrVq1f7MQwAQE3+PEaoadrcuXPnzp3rxwwAAMXx6yQAAKVRhAAApVGEAAClUYQAAKVRhAAApVGEAACl8QcfcBFm3jP/
04/+ajBAdzqcUgsKDDAYE6iJs6XffPcUslDHh3s/+uHUW41GSOlwOMyKA9VRhLgIe97Z5nzimAgMdjvixTu0S4Y3Tl5qMIllUd9udpJiXKxt23dYx9wjrnf/GeK6KrHyShMTQWkUIS5SZLwICnV7aWCQCAoVUYkmBkLXFBJu9DjROGoD8/BoAwAojSIEACiNIgQAKI0iBAAojSIEACiNIgQAKI0iBAAojSIEACiNIgQAKI0zy8BsMiTy6mtv1Cxu/xNWduaUHH6bmZHgRVLK+UtWfPzxxwZjvjl9Snz/56ZFgvn+fvjzn2ffbzwmOTFu59Y/mZPHGEUIs8m66s+vWSIs7h97by3RTMwD72psbPzt2tX6gp0GY7RTC03LA7947ncvfhb2XyItw+2IuvNf/THbxERGKEL4w8DxRmfuDo81MQq8T7NYxOB0owGhUZx2vfvrOcDoYVBTZl6S9nCMEACgNIoQAKA0ihAAoDSKEACgNIoQAKA0ihAAoDSKEACgNIoQAKA0ihAAoDTOLAPgYjjsW7b/+avTxe4udzqduq6bmQjeJZsaNm7e8tnRY24HSHny5NeXXdbPYJJP9v9NXDHEB+l8giIEcBH0k58ejIg/WD3A7YimBqFzArUuzPHPz/Yn9ttvcBfXlIk9u/ck3GQwiaV0p7jC+9l8hCIEcJH6fV9cP9ftpXXnxZaHTUwDHxgwzuguLjkmPlhnNEAI7ZPXvJ3JhzhGCABQGkUIAFAaRQgAUBpFCABQGkUIAFAaRQgAUBpFCABQGkUIAFAaRQgAUBpnllHF559/vn//fuMxRUVFffr0MRggJeeQBGCS+vr6DRs2GI+JiIj4yU9+4uGOKEJVjL3+RuflPxQWt/e4/cj7IqxH8IAfGEzCyZQBmOaBpctfevfTwN5uT94t7XUhBR9QhLhQjTZr049+LYLD3Y44m2HpPaz+v3OMZtmT6/VgANCmWqu1adSMpmuz3I6oLgnO+b7nO+IYIQBAaRQhAEBpFCEAQGkUIQBAaRQhAEBpFCEAQGkUIQBAaX4uwqqqqoyMjLi4uClTplRVVfk3DABAQX4uwtWrV6emppaUlKSkpKxZs8a/YQAACvLzmWXy8vLefvvtkJCQ7OzszMzMX/7yly0vdTgchw8f/u5pvRISEgYNGmQwrd1ub2xstFh44/c/fbVfBIa4vbSuSlYVixMftzNJ4T4RGOx+kvPi/IVNEuD+sVd/XlQVtT/JiU+EJcBwkjMXNImmuZ+kWlScvoBJDAfUVYtKw5vjeoQbT9JQ004Sh73dSWR9e5PYG4SU7SWxiopTRmMarEK0M4lsqBXnjCepaTeJbLCKc/80GmOrbP/mNLY3ibVSSL2dJO1OUnO2/UnsdaL8a8NJSoXeziSi3UmqioXuvIBJThqNqSxqdxLZVCfOGk5Sd17XHbt37zaY5Gz5WdHwlfG947A3PPnkk0KIqVOnDh061GA2A5qUsmPX9IrIyMjy8vKwsLD6+vqkpKSampqWl44cObKhoaFHjx6trjVu3LglS5YYTFteXh4fH08RtjR56q3VdXaDATWV55xaQGxsrMGYU1+dSOnXX3PfHNWV5VILjDGc5PTJE30uNZykolwGBMbEGE7y1YmUfv0NOuxCJjn11fHUywYIYTxJUExMTHuTDDQYcP7cWS0wONpwkn8WHu/bv71JgoKjo91PIuXpk4Uplw1ob5KQ6Oho93PIoq8LU/p5PslXKf36G09iCQrpYTSJXvT1SU8n0Z1F/zzZzs0pL7MEhxpOop85dbLPpUZJqspLA0LCv/tM1UzX9eJTX/e59DKDSSrLywJDwgwncRaf+mc7k5wtDQoNjzKYxOksLjrVp28/4yRBoeFRUVFuJ3E4iotOtZ8kLMJgEiFEU21VQpzRz0VVdY0l3O1d43Ku6GtHU6MQYtWqVffdd5/xYHf8/IpQSul6QpRSOp3OVpdGRUU9/fTT6enpFzttU1NTXFwcRdjS/r17PJ+ktLQ0
KSnJoMPQMSUlJb169fJ3iu5GSllWVtazZ09/B+ludF0vLy9PSkrydxCv8XNVJCcnFxUVCSGKi4t79+7t3zAAAAX5uQgzMjJyc3OllLm5uZmZmf4NAwBQkJ+LcMWKFUeOHOnTp8/Ro0cffvhhb01bVlbm32Of3VVpaam/I3RPZWVl/o7QDbneGvV3im5I1/WzZ8/6O4U3+fmXZYylp6c/8sgjHThGGBMTU1hYmJCQ4INQSgsJCTl//nxYWJi/g3Qruq4HBgbyR4+9rra2tmfPnjabzd9BupuysrKhQ4dWVFT4O4jX8OskAAClUYQAAKVRhAAApfn5c4TG7Hb7O++88/XXX1/sFZuamjZt2hQZGemLVCrTdf31118PDnZ/ZhlcPNdx+ldffdXfQbqbhoYGh8PBwnpdTU1NY2NjZ1vYCRMmpKSkdOy6nfqXZfbt2/fCCy8EBLg/jZYbNpstPDycz317ndVq5b8XvsDC+ggL6wtSyrq6uoiICH8H+Q/z5s0bMWJEx67bqYsQAABf4xghAEBpFCEAQGkUIQBAaRQhAEBpXa8InU7n4MGDW26pqqrKyMiIi4ubMmVKVVXVhVxlzJgx2rfmzJnj28RdhFcW1uFwzJ07NzExccyYMcXFxb5N3BV4ZVW17/Bt6K7AKwv74YcfpqWlRUVFpaWl7d2717eJuwjPF/bs2bMzZ87s1avXJZdckpWVVVtb6/PQHutiRfjss8+OHj26oKCg5cbVq1enpqaWlJSkpKSsWbOm3atIKY8dO3bmzJna2tra2tpnnnnGjOidm1cWVgjxzDPP1NTUnDp1avTo0StXrvR57s7NW6ta28Ly5csXL17s8+idm7cWdubMmcuWLausrHzooYdmzpzp89ydnlcW9mc/+9mll1566tSpwsLC2NjYRx55xITknpJdyu7du7dv394q9sCBA/Pz86WU+fn5AwcObPcqJSUlkZGRI0eOjIyMzMzMdP2pCsV5ZWGllFdeeeVnn30mpaypqTl48KCPU3d23lGF9gsAAAe3SURBVFrVZkeOHJkwYUJTU5OPAncV3lrYoUOHvvTSS5WVlS+//PKQIUN8Hbvz88rCRkZGnj9/3vV1ZWVlamqqb0N7QxcrQpdW91NERERdXZ2Usq6uLioqqt2rHD58+Lrrrjt8+HBFRcWsWbNmzJjh07RdiIcLK6WMi4tbvHhxbGzsyJEjjxw54ruoXYjnq+rS2Nh49dVXHz161BchuyLPF/bAgQPNLwkOHDjgu6hdi4cLm56evmTJkqqqqrKysvnz5wcHB/s0rVd0sbdG2ySldB01kVI6nc52x6elpe3evTstLS0uLi4nJ+e9997zfcYu6WIXVghRU1MjpTx69OjNN9989913+zhgl9SBVXV5+umnr7766qFDh/osWtfWgYVdvHjxokWLvvnmmwcffHDJkiU+DthVXezCvvbaa66/MnvNNdf069cvLi7O9xk91R2KMDk5uaioSAhRXFzcu3fvdscfOnRo3759rq+Dg4NDQkJ8m6/LutiFFUIkJibef//9vXr1ys7O/vLLL30csEvqwKoKIZxO5wsvvLBgwQJfRuvaOrCwn3766cKFC3v16rV48eJPP/3UxwG7qotd2IiIiLy8vNra2q+++iotLW3QoEG+z+ip7lCEGRkZubm5Usrc3NzMzEzXxj179rgbb7PZpk2blp+fb7fbV61aNXXqVJOCdjUXu7BCiJtuuum1115rbGxcv379VVddZUbKrqYDqyqE2L17d58+ffr37+/zfF1WBxb28ssvf+WVV6xW6+uvv37FFVeYkbILutiFXbRo0T333FNTU1NSUrJkyZL58+ebFNQT5r8b67lWsauqqm655ZbevXtnZGQ0H6RtNablt7quP//885dddllCQsKsWbOqq6tNyNwleLiwUsqSkpIbbrghOjp6/PjxJ06c8HXgLsHzVZVS3n777Y8++qhPc3Y5ni9sfn7+6NGjIyMjR48e7fp9EEiP
F/bcuXNTpkzp0aPHkCFD1q9fb0Jgz3HSbQCA0rrDW6MAAHQYRQgAUBpFCABQGkUIAFAaRQgAUBpFCABQGkUIAFAaRQgAUBpFCABQGkUIAFAaRQgAUBpFCPiKpmnHjh278K87m2PHjrn+EB3QvVGEAAClBfo7ANBt1dbWhoeHez4GgE9RhICvREZGemUMAJ/irVHgommatnfv3szMzLi4uLS0tP3793/00UfXXHNNZGTkgAED3n333eZh7R7/aznmiy++mDRpUlxcXHR09MSJE5u3SylfeOGFIUOGhIeHX3nllRs2bGj+M6JSyt/85jdDhw7t0aPHuHHjPv74Y9d2XdefeeaZYcOGRUZGjhw58q233mq5xwMHDkyfPj0uLq5fv35//OMfmy/au3fv+PHjo6OjL7vssp/+9KcVFRXNF7nLBnQDFCHQEfPmzbvtttt27NjRq1evSZMm3XvvvQ8++OC7777bt2/fOXPmdGBCp9N50003xcfHv/DCC+vXrw8ODp41a5brot///vfr1q1bunTpli1bbr755rvuuuvFF190XbRu3boVK1bce++9r776amJiYnp6+pEjR4QQa9euXb58+U9/+tPNmzdPnDhxxowZO3bsaBn++uuv37lz57hx42bNmmWz2YQQH3/8cXp6emJi4osvvpiTk9PU1HTLLbe0mw3oDnzzh++B7kwI8b//+7+urw8cOCCEOHz4cMtvm4fl5+df4NdFRUVCiH/84x+u7eXl5a+//rrr6xEjRhQWFjbv/Re/+MXYsWOllLqu9+zZc+PGja7tTqdz0qRJb7zxhq7r8fHxrheOLkuWLBk3blzzHn/961+7vq6pqWkOcP3118+ePbvlzbz77rtdt8UgG9ANUITARWvZCvn5+UIIXddbfts87MKL0Ol0zp49OzIycsqUKatXrz59+nTz7iIiIlr9/zUpKUlKWV5eLoQ4d+5cq3hlZWVCiIqKiuYte/fujY+Pb97jwYMHW94WV4DY2Ng9e/a0nOevf/2r67YYZAO6Ad4aBTqi1QfsPP+8ncViyc3NLSwsvPHGGw8cODB06NBFixa5LgoPDz906FB+C3v27BFCOBwOIURAQMCFTO50Opu//W6zCiECA1v/6lzzjTLIBnQDFCHQKVRVVWVlZcXFxWVnZ2/evHnz5s2/+93vXBcNGzasuLh48ODBgwcPHjRo0Lp16zZs2CCESEpKio2N/eCDD1zDdF2/6qqrnnzyycTExPj4+D//+c/Nk+/cuXP48OHGAa644orXXnut5ZaNGze2mw3oBvj4BNAp9OjRY9u2bTabbfr06Q0NDbm5uSNHjnRdtHDhwttvv/2xxx7r27dvXl7e66+/vn37diGEpmkPPvhgVlZWaWlp//79N23a9MUXX7z22muapi1evHju3LllZWVDhw7du3fvU089tXXrVuMAjz766NixY2tqam677bbAwMC8vLy9e/e2mw3oDvz93izQ9YgWB/laHhSUHhwjlFLu379/9OjRERERsbGx06ZNa3ko7pVXXhk8eHB4ePjIkSP/9Kc/NW93Op05OTn9+/cPDw8fNWrUe++917x97dq1rk9cpKWlvfXWW22Gb/Xthx9+OG7cuOjo6L59+2ZlZR08eLD5thhkA7o6TX77gSQAABTEMUIAgNL+PwJldbrMDip0AAAAAElFTkSuQmCC"
     },
     "execution_count": 16,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "t = c_bench.times / 1e6 # times in milliseconds\n",
    "m, σ = minimum(t), std(t)\n",
    "\n",
    "histogram(t, bins=500,\n",
    "    xlim=(m - 0.01, m + σ),\n",
    "    xlabel=\"milliseconds\", ylabel=\"count\", label=\"\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "# 2. Python's built-in `sum`"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "The `PyCall` package provides a Julia interface to Python:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {
    "scrolled": true,
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "using PyCall"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {
    "scrolled": false,
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "PyObject <built-in function sum>"
      ]
     },
     "execution_count": 18,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Call a low-level PyCall function to get a Python list, because\n",
    "# by default PyCall will convert to a NumPy array instead (we benchmark NumPy below):\n",
    "\n",
    "apy_list = PyCall.array2py(a)\n",
    "\n",
    "# get the Python built-in \"sum\" function:\n",
    "pysum = pybuiltin(\"sum\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {
    "scrolled": false,
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "4.9993188115826165e6"
      ]
     },
     "execution_count": 19,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "pysum(apy_list)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {
    "scrolled": false,
    "slideshow": {
     "slide_type": "skip"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "true"
      ]
     },
     "execution_count": 20,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "pysum(apy_list) ≈ sum(a)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {
    "scrolled": false,
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "BenchmarkTools.Trial: \n",
       "  memory estimate:  48 bytes\n",
       "  allocs estimate:  3\n",
       "  --------------\n",
       "  minimum time:     34.508 ms (0.00% GC)\n",
       "  median time:      34.537 ms (0.00% GC)\n",
       "  mean time:        34.609 ms (0.00% GC)\n",
       "  maximum time:     37.440 ms (0.00% GC)\n",
       "  --------------\n",
       "  samples:          145\n",
       "  evals/sample:     1"
      ]
     },
     "execution_count": 21,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "py_list_bench = @benchmark $pysum($apy_list)"
   ]
  },
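  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "The benchmark above interpolates its arguments with `$`. As a minimal sketch of that pattern, assuming the `BenchmarkTools` package is installed (the names `x`, `b` and `t_ms` are hypothetical and not part of this notebook):\n",
    "\n",
    "```julia\n",
    "using BenchmarkTools\n",
    "\n",
    "x = rand(10^5)              # hypothetical test data\n",
    "\n",
    "# `$x` splices the value of x into the benchmarked expression, so the\n",
    "# measurement is not distorted by access to a non-constant global variable.\n",
    "b = @benchmark sum($x)\n",
    "\n",
    "# b.times holds one entry per sample, in nanoseconds; convert to milliseconds.\n",
    "t_ms = minimum(b.times) / 1e6\n",
    "println(\"fastest sample: \", t_ms, \" ms\")\n",
    "```\n",
    "\n",
    "A common argument for reporting the minimum, as this notebook does, is that noise from the rest of the system can only make a run slower, never faster."
   ]
  },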
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {
    "scrolled": false,
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Dict{Any,Any} with 2 entries:\n",
       "  \"C\"               => 10.1584\n",
       "  \"Python built-in\" => 34.5081"
      ]
     },
     "execution_count": 22,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "d[\"Python built-in\"] = minimum(py_list_bench.times) / 1e6\n",
    "d"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "# 3. Python: `numpy`\n",
    "\n",
    "## Takes advantage of hardware \"SIMD\", but only for operations it implements itself.\n",
    "\n",
    "`numpy` is an optimized C library, callable from Python.\n",
    "If it is not already available, it can be installed from within Julia using the `Conda` package:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {
    "scrolled": false,
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "outputs": [],
   "source": [
    "using Conda\n",
    "# Conda.add(\"numpy\")  # uncomment to install numpy if it is not already available"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "metadata": {
    "scrolled": false,
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "PyObject array([0.3068332 , 0.04759437, 0.89210733, ..., 0.60798161, 0.77769452,\n",
       "       0.76582976])"
      ]
     },
     "execution_count": 24,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "numpy_sum = pyimport(\"numpy\").\"sum\"\n",
    "apy_numpy = PyObject(a) # converts to a numpy array by default"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "4.9993188115822775e6"
      ]
     },
     "execution_count": 25,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "numpy_sum(apy_numpy) # sum of the NumPy array created above"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 26,
   "metadata": {
    "scrolled": false,
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "BenchmarkTools.Trial: \n",
       "  memory estimate:  48 bytes\n",
       "  allocs estimate:  3\n",
       "  --------------\n",
       "  minimum time:     3.471 ms (0.00% GC)\n",
       "  median time:      3.493 ms (0.00% GC)\n",
       "  mean time:        3.520 ms (0.00% GC)\n",
       "  maximum time:     5.440 ms (0.00% GC)\n",
       "  --------------\n",
       "  samples:          1419\n",
       "  evals/sample:     1"
      ]
     },
     "execution_count": 26,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "py_numpy_bench = @benchmark $numpy_sum($apy_numpy)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "metadata": {
    "scrolled": false,
    "slideshow": {
     "slide_type": "skip"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "true"
      ]
     },
     "execution_count": 27,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "numpy_sum(apy_numpy) ≈ sum(a)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 28,
   "metadata": {
    "scrolled": true,
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Dict{Any,Any} with 3 entries:\n",
       "  \"C\"               => 10.1584\n",
       "  \"Python numpy\"    => 3.47099\n",
       "  \"Python built-in\" => 34.5081"
      ]
     },
     "execution_count": 28,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "d[\"Python numpy\"] = minimum(py_numpy_bench.times) / 1e6\n",
    "d"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "# 4. Python (hand-written)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "metadata": {
    "scrolled": false,
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "PyObject <function py_sum at 0x7f9461d16e18>"
      ]
     },
     "execution_count": 29,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "py\"\"\"\n",
    "def py_sum(A):\n",
    "    s = 0.0\n",
    "    for a in A:\n",
    "        s += a\n",
    "    return s\n",
    "\"\"\"\n",
    "\n",
    "sum_py = py\"py_sum\""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "metadata": {
    "scrolled": false,
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "4.9993188115826165e6"
      ]
     },
     "execution_count": 30,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "sum_py(apy_list)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 31,
   "metadata": {
    "scrolled": false,
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "BenchmarkTools.Trial: \n",
       "  memory estimate:  48 bytes\n",
       "  allocs estimate:  3\n",
       "  --------------\n",
       "  minimum time:     183.032 ms (0.00% GC)\n",
       "  median time:      184.175 ms (0.00% GC)\n",
       "  mean time:        184.283 ms (0.00% GC)\n",
       "  maximum time:     186.966 ms (0.00% GC)\n",
       "  --------------\n",
       "  samples:          28\n",
       "  evals/sample:     1"
      ]
     },
     "execution_count": 31,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "py_hand = @benchmark $sum_py($apy_list)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 32,
   "metadata": {
    "scrolled": false,
    "slideshow": {
     "slide_type": "skip"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "true"
      ]
     },
     "execution_count": 32,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "sum_py(apy_list) ≈ sum(a)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 33,
   "metadata": {
    "scrolled": true,
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Dict{Any,Any} with 4 entries:\n",
       "  \"C\"                   => 10.1584\n",
       "  \"Python numpy\"        => 3.47099\n",
       "  \"Python hand-written\" => 183.032\n",
       "  \"Python built-in\"     => 34.5081"
      ]
     },
     "execution_count": 33,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "d[\"Python hand-written\"] = minimum(py_hand.times) / 1e6\n",
    "d"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "# 5. Julia (built-in) \n",
    "\n",
    "## Written directly in Julia, not in C!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 34,
   "metadata": {
    "scrolled": false,
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "sum(a::<b>AbstractArray</b>) in Base at <a href=\"https://github.com/JuliaLang/julia/tree/80516ca20297a67b996caa08c38786332379b6a5/base/reducedim.jl#L648\" target=\"_blank\">reducedim.jl:648</a>"
      ],
      "text/plain": [
       "sum(a::AbstractArray) in Base at reducedim.jl:648"
      ]
     },
     "execution_count": 34,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "@which sum(a)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 35,
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "4.999318811582287e6"
      ]
     },
     "execution_count": 35,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "sum(a)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 36,
   "metadata": {
    "scrolled": false,
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "BenchmarkTools.Trial: \n",
       "  memory estimate:  0 bytes\n",
       "  allocs estimate:  0\n",
       "  --------------\n",
       "  minimum time:     3.248 ms (0.00% GC)\n",
       "  median time:      3.270 ms (0.00% GC)\n",
       "  mean time:        3.299 ms (0.00% GC)\n",
       "  maximum time:     4.588 ms (0.00% GC)\n",
       "  --------------\n",
       "  samples:          1514\n",
       "  evals/sample:     1"
      ]
     },
     "execution_count": 36,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "j_bench = @benchmark sum($a)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 37,
   "metadata": {
    "scrolled": true,
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Dict{Any,Any} with 5 entries:\n",
       "  \"C\"                   => 10.1584\n",
       "  \"Python numpy\"        => 3.47099\n",
       "  \"Python hand-written\" => 183.032\n",
       "  \"Python built-in\"     => 34.5081\n",
       "  \"Julia built-in\"      => 3.24809"
      ]
     },
     "execution_count": 37,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "d[\"Julia built-in\"] = minimum(j_bench.times) / 1e6\n",
    "d"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "# 6. Julia (hand-written) "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 38,
   "metadata": {
    "scrolled": true,
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "mysum (generic function with 1 method)"
      ]
     },
     "execution_count": 38,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "function mysum(A)\n",
    "    s = 0.0 # s = zero(eltype(A))\n",
    "    for a in A\n",
    "        s += a\n",
    "    end\n",
    "    s\n",
    "end"
   ]
  },
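  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "The comment in `mysum` hints at a more generic accumulator. A minimal sketch of that variant follows (the name `mysum_generic` is hypothetical). Starting from the zero of the element type means the result keeps the element type of the input instead of being promoted to `Float64`:\n",
    "\n",
    "```julia\n",
    "# Hypothetical generic variant of mysum: the accumulator starts at the\n",
    "# zero of the element type rather than at the Float64 value 0.0.\n",
    "function mysum_generic(A)\n",
    "    s = zero(eltype(A))\n",
    "    for a in A\n",
    "        s += a\n",
    "    end\n",
    "    return s\n",
    "end\n",
    "\n",
    "mysum_generic([1, 2, 3])           # 6 (an Int, not 6.0)\n",
    "mysum_generic(Float32[1.5, 2.5])   # 4.0f0\n",
    "mysum_generic([1//2, 1//3])        # 5//6\n",
    "```\n",
    "\n",
    "For the `Float64` array benchmarked in this notebook the two versions do the same work, so the timings should not be expected to differ."
   ]
  },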
  {
   "cell_type": "code",
   "execution_count": 39,
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "4.9993188115826165e6"
      ]
     },
     "execution_count": 39,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "mysum(a)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 40,
   "metadata": {
    "scrolled": false,
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "BenchmarkTools.Trial: \n",
       "  memory estimate:  0 bytes\n",
       "  allocs estimate:  0\n",
       "  --------------\n",
       "  minimum time:     10.164 ms (0.00% GC)\n",
       "  median time:      10.184 ms (0.00% GC)\n",
       "  mean time:        10.197 ms (0.00% GC)\n",
       "  maximum time:     10.598 ms (0.00% GC)\n",
       "  --------------\n",
       "  samples:          491\n",
       "  evals/sample:     1"
      ]
     },
     "execution_count": 40,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "j_bench_hand = @benchmark mysum($a)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 41,
   "metadata": {
    "scrolled": true,
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Dict{Any,Any} with 6 entries:\n",
       "  \"C\"                   => 10.1584\n",
       "  \"Python numpy\"        => 3.47099\n",
       "  \"Julia hand-written\"  => 10.1639\n",
       "  \"Python hand-written\" => 183.032\n",
       "  \"Python built-in\"     => 34.5081\n",
       "  \"Julia built-in\"      => 3.24809"
      ]
     },
     "execution_count": 41,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "d[\"Julia hand-written\"] = minimum(j_bench_hand.times) / 1e6\n",
    "d"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "# 7. Julia (hand-written with SIMD)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 42,
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "mysum_simd (generic function with 1 method)"
      ]
     },
     "execution_count": 42,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "function mysum_simd(A)   \n",
    "    s = 0.0 # s = zero(eltype(A))\n",
    "    @simd for a in A\n",
    "        s += a\n",
    "    end\n",
    "    s\n",
    "end"
   ]
  },
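  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "`@simd` gives the compiler permission to reorder the additions, so the result of `mysum_simd` can differ from that of `mysum` in the last few digits. A rough sketch of the idea, assuming four independent partial sums (the function `mysum_lanes` is hypothetical and only mimics by hand what a vectorizing compiler might do; the code actually generated depends on the CPU):\n",
    "\n",
    "```julia\n",
    "# Floating-point addition is not associative:\n",
    "(0.1 + 0.2) + 0.3 == 0.1 + (0.2 + 0.3)   # false\n",
    "\n",
    "# Hypothetical hand-written version with four partial sums (\"lanes\").\n",
    "function mysum_lanes(A::AbstractVector{Float64})\n",
    "    s1 = s2 = s3 = s4 = 0.0\n",
    "    n = length(A)\n",
    "    i = firstindex(A)\n",
    "    last4 = i + 4 * (n ÷ 4) - 1   # last index covered by full groups of four\n",
    "    while i <= last4\n",
    "        s1 += A[i]\n",
    "        s2 += A[i + 1]\n",
    "        s3 += A[i + 2]\n",
    "        s4 += A[i + 3]\n",
    "        i += 4\n",
    "    end\n",
    "    s = (s1 + s2) + (s3 + s4)\n",
    "    while i <= lastindex(A)       # leftover elements\n",
    "        s += A[i]\n",
    "        i += 1\n",
    "    end\n",
    "    return s\n",
    "end\n",
    "\n",
    "x = rand(10^6)\n",
    "mysum_lanes(x) ≈ sum(x)   # true, although the last digits may differ\n",
    "```\n",
    "\n",
    "The four partial sums do not depend on each other, which is what allows the hardware to process several additions at once."
   ]
  },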
  {
   "cell_type": "code",
   "execution_count": 43,
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "4.999318811582319e6"
      ]
     },
     "execution_count": 43,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "mysum_simd(a)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 44,
   "metadata": {
    "scrolled": false,
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "BenchmarkTools.Trial: \n",
       "  memory estimate:  0 bytes\n",
       "  allocs estimate:  0\n",
       "  --------------\n",
       "  minimum time:     3.246 ms (0.00% GC)\n",
       "  median time:      3.267 ms (0.00% GC)\n",
       "  mean time:        3.283 ms (0.00% GC)\n",
       "  maximum time:     4.775 ms (0.00% GC)\n",
       "  --------------\n",
       "  samples:          1521\n",
       "  evals/sample:     1"
      ]
     },
     "execution_count": 44,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "j_bench_hand_simd = @benchmark mysum_simd($a)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 45,
   "metadata": {
    "scrolled": true,
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Dict{Any,Any} with 7 entries:\n",
       "  \"Julia hand-written simd\" => 3.24612\n",
       "  \"C\"                       => 10.1584\n",
       "  \"Python numpy\"            => 3.47099\n",
       "  \"Julia hand-written\"      => 10.1639\n",
       "  \"Python hand-written\"     => 183.032\n",
       "  \"Python built-in\"         => 34.5081\n",
       "  \"Julia built-in\"          => 3.24809"
      ]
     },
     "execution_count": 45,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "d[\"Julia hand-written simd\"] = minimum(j_bench_hand_simd.times) / 1e6\n",
    "d"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "# Summary"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 46,
   "metadata": {
    "scrolled": true,
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Julia hand-written simd.....3.2\n",
      "Julia built-in..............3.2\n",
      "Python numpy................3.5\n",
      "C..........................10.2\n",
      "Julia hand-written.........10.2\n",
      "Python built-in............34.5\n",
      "Python hand-written.......183.0\n"
     ]
    }
   ],
   "source": [
    "for (key, value) in sort(collect(d), by=last)\n",
    "    println(rpad(key, 25, \".\"), lpad(round(value, digits=1), 6, \".\"))\n",
    "end"
   ]
  },
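  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "To read the table as relative slowdowns, one can divide each timing by the fastest one. A small sketch, using a hypothetical dictionary `timings` filled with the rounded values printed above (your own run will give different numbers):\n",
    "\n",
    "```julia\n",
    "# Hypothetical copy of the minimum times (in milliseconds) printed above.\n",
    "timings = Dict(\n",
    "    \"Julia hand-written simd\" => 3.2,\n",
    "    \"Julia built-in\"          => 3.2,\n",
    "    \"Python numpy\"            => 3.5,\n",
    "    \"C\"                       => 10.2,\n",
    "    \"Julia hand-written\"      => 10.2,\n",
    "    \"Python built-in\"         => 34.5,\n",
    "    \"Python hand-written\"     => 183.0,\n",
    ")\n",
    "\n",
    "fastest = minimum(values(timings))\n",
    "\n",
    "# Print each entry as a multiple of the fastest time.\n",
    "for (key, value) in sort(collect(timings), by=last)\n",
    "    ratio = round(value / fastest, digits=1)\n",
    "    println(rpad(key, 25, \".\"), lpad(ratio, 6, \".\"), \"x\")\n",
    "end\n",
    "```\n",
    "\n",
    "Ratios are easier to compare across machines than absolute times, but they still depend on the hardware, the compiler flags and the library versions used."
   ]
  },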
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "accelerator": "GPU",
  "anaconda-cloud": {},
  "colab": {
   "collapsed_sections": [],
   "name": "julia.ipynb",
   "provenance": [],
   "version": "0.3.2"
  },
  "kernelspec": {
   "display_name": "Julia 1.1.0",
   "language": "julia",
   "name": "julia-1.1"
  },
  "language_info": {
   "file_extension": ".jl",
   "mimetype": "application/julia",
   "name": "julia",
   "version": "1.1.0"
  },
  "toc": {
   "colors": {
    "hover_highlight": "#DAA520",
    "running_highlight": "#FF0000",
    "selected_highlight": "#FFD700"
   },
   "moveMenuLeft": true,
   "nav_menu": {
    "height": "212px",
    "width": "252px"
   },
   "navigate_menu": true,
   "number_sections": true,
   "sideBar": true,
   "threshold": "2",
   "toc_cell": false,
   "toc_section_display": "block",
   "toc_window_display": false
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
